Alignment Proposals
10 pages tagged “Alignment Proposals”
Can you give an AI a goal which involves “minimally impacting the world”?
Could governmental investments help with AI alignment?
What is AI Safety via Debate?
What is "HCH"?
What is Ought's research agenda?
What is reinforcement learning from human feedback (RLHF)?
What is Iterated Distillation and Amplification (IDA)?
What is prosaic alignment?
What is shard theory?
What is "Constitutional AI"?